New Features for Automatic Text Independent Language Identification

Authors

  • A. Nagesh
  • Kamakshi Prasad
Abstract

A. Nagesh (Mahatma Gandhi Institute of Technology, Hyderabad, India) and V. Kamakshi Prasad (Jawaharlal Nehru Technological University, Hyderabad, India)

This paper explores new feature vectors for automatic text-independent language identification (LID). The task of LID is to quickly and accurately identify the language being spoken. Feature vectors are the basic requirement for building an LID system, and most existing LID systems are based on MFCC feature vectors. This paper proposes an approach for deriving a new type of feature vector from the speech signal for the LID task. The variation in the frequency of occurrence of phonemes is captured effectively using Gaussians: the underlying probability function of a Gaussian mixture model (GMM) represents these variations in the form of Gaussian probabilities. The new feature vectors are obtained by transforming 12-dimensional MFCC feature vectors into 15-dimensional feature vectors using the probabilities of the Gaussian mixtures. These feature vectors are modeled using hidden Markov models (HMMs) for LID. Experiments were carried out by varying the number of states and mixture components in the HMM. The LID system is evaluated on 10 languages, and identification performance for six languages is 100%. The approach outperformed existing conventional MFCC-based LID systems. The LID studies were demonstrated on the OGI_MLT speech database.
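The abstract does not say how the 15 dimensions are obtained. One plausible reading, offered here only as an assumption, is that each 12-dimensional MFCC frame is scored against a 15-component GMM and the per-component posterior probabilities form the new feature vector. The sketch below illustrates that reading in plain Python with diagonal covariances; the function names, the random toy parameters, and the choice of posteriors (rather than raw likelihoods) are all hypothetical and not taken from the paper.

```python
import math
import random


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given one frame x."""
    log_joint = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Subtract the maximum before exponentiating (log-sum-exp trick)
    # so that very small likelihoods do not underflow to zero.
    peak = max(log_joint)
    exps = [math.exp(lj - peak) for lj in log_joint]
    total = sum(exps)
    return [e / total for e in exps]


def transform_frames(frames, weights, means, variances):
    """Map each MFCC frame to a vector of component posteriors."""
    return [gmm_posteriors(f, weights, means, variances) for f in frames]


# Toy setup (assumed): random parameters stand in for a GMM that would
# normally be trained on MFCC frames, and random frames stand in for speech.
rng = random.Random(0)
N_MFCC, N_MIX = 12, 15
weights = [1.0 / N_MIX] * N_MIX
means = [[rng.gauss(0.0, 1.0) for _ in range(N_MFCC)] for _ in range(N_MIX)]
variances = [[1.0] * N_MFCC for _ in range(N_MIX)]
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_MFCC)] for _ in range(5)]

features = transform_frames(frames, weights, means, variances)
print(len(features), len(features[0]))
```

Under this assumption, the resulting sequence of 15-dimensional vectors would then be passed to one HMM per language, with the highest-scoring model giving the identified language.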

Similar resources

An Automatic Fingerprint Classification Algorithm

Manual fingerprint classification algorithms are very time-consuming and usually not accurate. Fast and accurate fingerprint classification is essential to every AFIS (Automatic Fingerprint Identification System). This paper investigates a fingerprint classification algorithm that reduces the complexity and costs associated with the fingerprint identification procedure. A new structural algorit...

Offline Language-free Writer Identification based on Speeded-up Robust Features

This article proposes offline language-free writer identification based on speeded-up robust features (SURF), which goes through training, enrollment, and identification stages. In all stages, an isotropic box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of each word region and the corresponding scales and orientations (SOs) are extr...

Ijaz: An Operational System for Single-Document Summarization of Persian News Texts

The rapid growth of documents published on the web has created new demands for processing, classification, and information retrieval, so the use of natural language processing tools has increased around the world. Automatic summarization is known as the core of a wide range of text-processing tools such as decision systems, accountability systems, search engines, etc., and has always been inv...

Author gender identification from text using Bayesian Random Forest

Nowadays, the heavy use of virtual environments and users' connections via social networks such as Facebook, Instagram, and Twitter make it more necessary than ever to identify shared subjects in these environments. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications exist across a wide area of fields,...

Publication date: 2011